In the PM2.5 map, PM2.5 is measured as the “annual mean concentration of PM2.5 (weighted average of measured monitor concentrations and satellite observations, μg/m3), over three years (2015 to 2017)”. Census tracts with higher annual PM2.5 concentrations lie east of San Francisco Bay and in the central-to-eastern part of the Bay Area, while tracts along the northern and southern borders have lower concentrations.
In the asthma map, asthma is measured as the “spatially modeled, age-adjusted rate of ED visits for asthma per 10,000 (averaged over 2015-2017)”. Areas east of San Francisco Bay have higher rates of asthma ED visits, as do places in the northeastern Bay Area, such as Vallejo and Antioch.
At this stage, the best-fit line is roughly representative of the data set: most points are distributed along the line, although a few lie far from it.
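A scatter plot with a best-fit line of this kind can be sketched in base R as follows. This is a minimal illustration rather than the code behind the original figure: `sim_tract` is a hypothetical, simulated stand-in for `bay_pm25_asthma_tract`, and its values are invented purely so the example runs on its own.

```r
# Simulated stand-in for bay_pm25_asthma_tract (hypothetical values, NOT the real data)
set.seed(1)
n <- 200
sim_tract <- data.frame(PM2.5 = runif(n, min = 6, max = 11))
sim_tract$Asthma <- exp(0.7 + 0.35 * sim_tract$PM2.5 + rnorm(n, sd = 0.65))

# Fit the same kind of simple linear model as in the report
fit_sim <- lm(Asthma ~ PM2.5, data = sim_tract)

# Scatter plot of the tracts with the fitted line drawn on top
plot(sim_tract$PM2.5, sim_tract$Asthma,
     xlab = "PM2.5 (annual mean)",
     ylab = "Asthma ED visits per 10,000")
abline(fit_sim, col = "red", lwd = 2)
```

Swapping `sim_tract` for the real tract-level data frame would reproduce the plot for the actual data, assuming it has `PM2.5` and `Asthma` columns as the `lm()` call in the report suggests.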
##
## Call:
## lm(formula = Asthma ~ PM2.5, data = bay_pm25_asthma_tract)
##
## Residuals:
##    Min     1Q Median     3Q    Max
## -54.47 -25.89  -9.61  12.94 182.95
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) -116.278     13.040  -8.917   <2e-16 ***
## PM2.5         19.862      1.534  12.950   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 37.49 on 1578 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.09606, Adjusted R-squared: 0.09549
## F-statistic: 167.7 on 1 and 1578 DF, p-value: < 2.2e-16
An increase of 1 μg/m3 in PM2.5 is associated with an increase of 19.862 in Asthma (ED visits per 10,000). The p-value is very close to zero, so we reject the null hypothesis and consider the relationship statistically significant. However, the R-squared of 0.09606 means that only about 9.6% of the variation in Asthma is explained by the variation in PM2.5.
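To make the reading of the summary concrete, the fitted line and the R-squared can be worked through by hand. The sketch below copies the coefficients from the `lm()` output above; the PM2.5 values 8 and 9 are arbitrary illustrative inputs, and `predict_asthma` is a hypothetical helper name.

```r
# Values copied from the lm() summary above
intercept <- -116.278
slope     <- 19.862
r_squared <- 0.09606

# Fitted line: Asthma = intercept + slope * PM2.5
predict_asthma <- function(pm25) intercept + slope * pm25

# Difference in predicted Asthma between two tracts one PM2.5 unit apart
# (8 and 9 are arbitrary illustrative values)
diff_one_unit <- predict_asthma(9) - predict_asthma(8)

# R-squared is a proportion, so multiply by 100 to read it as a percentage
pct_explained <- 100 * r_squared

print(diff_one_unit)
print(pct_explained)
```

The one-unit difference equals the slope regardless of the starting value, and the R-squared of 0.09606 corresponds to roughly 9.6% of the variation in Asthma.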
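The residual distribution is clearly not symmetric: the bulk of the residuals sits to the left of zero (median -9.61), while a long tail extends to the right (maximum 182.95), so the distribution is right-skewed.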
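The direction of the skew can be checked roughly from the residual quartiles printed in the summary. The sketch below uses the quartile (Bowley) skewness coefficient, which is a swapped-in diagnostic and not something the original analysis computes; a positive value points to a longer right tail.

```r
# Residual quantiles copied from the first lm() summary
res_min <- -54.47
q1      <- -25.89
med     <- -9.61
q3      <- 12.94
res_max <- 182.95

# Quartile (Bowley) skewness: (Q3 + Q1 - 2 * median) / (Q3 - Q1)
bowley <- (q3 + q1 - 2 * med) / (q3 - q1)

# Ratio of the upper tail length to the lower tail length
tail_ratio <- res_max / abs(res_min)

print(bowley)
print(tail_ratio)
```

Both quantities come out on the positive-skew side: the Bowley coefficient is above zero and the largest positive residual is more than three times the size of the largest negative one.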
##
## Call:
## lm(formula = log(Asthma) ~ PM2.5, data = bay_pm25_asthma_tract)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -2.00402 -0.46479  0.03313  0.42298  1.75525
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.69234    0.22840   3.031  0.00248 **
## PM2.5        0.35633    0.02686  13.264  < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6566 on 1578 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.1003, Adjusted R-squared: 0.09974
## F-statistic: 175.9 on 1 and 1578 DF, p-value: < 2.2e-16
This time, the median residual is very close to zero (0.03313) and the residuals follow a distribution much closer to normal.
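Because the response is now `log(Asthma)`, the PM2.5 coefficient is read on a multiplicative scale. As a minimal sketch, using only the coefficient printed in the summary above, the slope can be back-transformed into an approximate percentage change in Asthma per one-unit increase in PM2.5:

```r
# PM2.5 coefficient copied from the log(Asthma) ~ PM2.5 summary
log_slope <- 0.35633

# A one-unit increase in PM2.5 multiplies the predicted Asthma rate by exp(slope)
multiplier <- exp(log_slope)

# Expressed as a percentage change
pct_change <- 100 * (multiplier - 1)

print(multiplier)
print(pct_change)
```

Under this model, each additional unit of PM2.5 is associated with an increase of roughly 43% in the asthma ED visit rate, rather than a fixed additive amount as in the first model.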